Beyond Basic Search: Overcoming the Limits of Semantic Similarity

Beyond Similarity

The "80% problem" arises when basic semantic search works well for simple questions but fails on edge cases. When we search by similarity alone, the vector store returns the chunks that are numerically closest to the query. If those chunks are nearly identical to one another, the LLM receives overlapping information, which wastes the limited context window and loses the broader perspective.

Pillars of Advanced Retrieval

Maximal Marginal Relevance (MMR): Rather than picking only the most similar items, MMR balances relevance against diversity to avoid duplicated information. At each step it selects $\text{MMR} = \operatorname*{arg\,max}_{d \in R \setminus S} \left[ \lambda \cdot \text{sim}(d, q) - (1 - \lambda) \cdot \max_{s \in S} \text{sim}(d, s) \right]$, where $q$ is the query, $R$ is the set of retrieved candidates, $S$ is the set of chunks already selected, and $\lambda \in [0, 1]$ sets the trade-off between relevance and diversity.
Self-Querying: Uses an LLM to turn natural language into structured metadata filters (for example, filtering on "Lecture 3" or "Source: PDF").
Contextual Compression: Shrinks the retrieved documents down to only the "high-nutrient" snippets that are relevant to the question, which saves tokens.
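The MMR formula can be made concrete with a small greedy loop. The following is a minimal sketch, not a library implementation: the function names (`cosine`, `mmr_select`, `unit`), the 2-D toy "embeddings", and the choice of cosine similarity for $\text{sim}$ are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mmr_select(query_vec, doc_vecs, k=2, lam=0.5):
    """Greedily pick k document indices by the MMR score.

    lam=1.0 reduces to plain similarity ranking; lower values
    penalise candidates that resemble what was already selected.
    """
    selected = []                            # S in the formula
    candidates = list(range(len(doc_vecs)))  # R \ S in the formula

    def score(i):
        relevance = cosine(doc_vecs[i], query_vec)
        # Similarity to the closest already-selected document (0 if S is empty).
        redundancy = max(
            (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

def unit(degrees):
    """Hypothetical toy embedding: a unit vector at the given angle."""
    r = math.radians(degrees)
    return [math.cos(r), math.sin(r)]

query = unit(0)
docs = [
    unit(10),   # 0: most similar to the query
    unit(12),   # 1: near-duplicate of document 0
    unit(-30),  # 2: less similar, but covers a different direction
]

diverse = mmr_select(query, docs, k=2, lam=0.5)
similarity_only = mmr_select(query, docs, k=2, lam=1.0)
print(diverse, similarity_only)
```

With `lam=1.0` the near-duplicate is selected second; with `lam=0.5` the redundancy penalty pushes it below the more distinct document, which is the behaviour the formula describes.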

The Information Duplication Trap

Giving an LLM three versions of the same paragraph does not make it smarter; it only makes the prompt more expensive. Diversity is the key to "high-nutrient" context.

Knowledge Check

You want your system to answer one specific question: "What did the instructor say about probability in the third lecture?" Which tool allows the LLM to automatically apply a filter for { "source": "lecture3.pdf" }?

ConversationBufferMemory

Self-Querying Retriever

Contextual Compression

MapReduce Chain

Challenge: The Token Limit Dilemma

Apply advanced retrieval strategies to solve a real-world constraint.

You are building a RAG system for a legal firm. The documents retrieved are 50 pages long, but only 2 sentences per page are actually relevant to the user's specific query. The standard "Stuff" chain is throwing an OutOfTokens error because the context window is overflowing with irrelevant text.

Step 1

Identify the core problem and select the appropriate advanced retrieval tool to solve it without losing specific nuances.

Problem: The context window limit is being exceeded by "low-nutrient" text surrounding the relevant facts.

Tool Selection: ContextualCompressionRetriever

Step 2

What specific component must you use in conjunction with this retriever to "squeeze" the documents?

Solution: Use an LLMChainExtractor as the retriever's base compressor. It processes each retrieved document and extracts only the snippets relevant to the query, so the final prompt receives a much smaller, highly concentrated context.
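To illustrate the "squeeze" step without calling a model, here is a minimal sketch in which a simple keyword-overlap filter stands in for the LLM-based extractor. It is an assumption-laden toy: `terms`, `compress`, the stopword list, and the sample contract text are all hypothetical, and a real extractor would judge relevance with an LLM rather than by shared words.

```python
import re

# Small, hand-picked stopword list (an assumption for this toy example).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "are",
             "for", "and", "what", "by", "this"}

def terms(text):
    """Lowercased content words of a text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}

def compress(document, query, min_overlap=1):
    """Keep only sentences sharing at least min_overlap terms with the query.

    Stand-in for an LLM extractor: same input/output shape
    (document + query in, relevant snippets out), much cruder judgement.
    """
    query_terms = terms(query)
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    return [s for s in sentences
            if len(terms(s) & query_terms) >= min_overlap]

# Made-up page text, purely for illustration.
page = (
    "This agreement is governed by the laws of the State. "
    "The tenant must give 60 days notice before termination. "
    "Headings are for convenience only. "
    "Termination without notice forfeits the deposit."
)
query = "What notice is required for termination?"

kept = compress(page, query)
for sentence in kept:
    print(sentence)
```

Only the sentences that mention the query's terms survive, so the context passed to the final prompt shrinks while the relevant facts are preserved.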